-
Notifications
You must be signed in to change notification settings - Fork 165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
new(userspace/libsinsp): allow to not retrieve detailed user info #1765
new(userspace/libsinsp): allow to not retrieve detailed user info #1765
Conversation
c4d7e8a
to
824448e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand the need, but i feel like a runtime configuration would be much better, like:
- change API to
void set_import_users(bool import_users, bool details=true);
- modify
scap_{userinfo,groupinfo}
structures so that all chars array become just char pointers - by default, all
sinsp_userinfo->{name,homedir,shell}
will default to point to the<NA>
static string ifdetails==false
This should need the same amount of changes and is runtime configurable. Of course we need to pass the details
bool down to scap, but since that information bit is already managed by sinsp_user
class, it should be pretty easy.
WDYT?
case TYPE_SHELL: | ||
RETURN_EXTRACT_CSTR(tinfo->m_user.shell); | ||
RETURN_EXTRACT_CSTR(tinfo->m_user->shell); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't this crash when dereferencing m_user
that is nullptr (ie: when SINSP_USER_DETAILS
is undefined)?
@FedeDP thanks for looking into this!
I was thinking about this, but haven't found any use-case for switching it at runtime. Which one do you have in mind?
Not sure what do you mean by "information bit is already managed by sinsp_user"? My initial impression was that modifying |
I mean that
On a second thought, we also dump We might just create a
It's just that i'd avoid the build time complexity given it is not needed in this specific case. I can see Falco gaining an option to disable the gathering of user and group details in some specific scenarios (lots of threadinfos and thus lots of ram used by Falco itself). |
824448e
to
2df0a34
Compare
Ok, @FedeDP I've updated PR based on your feedback, does it look closer to what you had in mind? Probably the only difference is that if not requested, name/homedir/shell are still going to be empty, not "N/A", but it's on purpose -- I still can't force myself to waste even these few bytes :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you very much! This looks fine, left some comments and questions!
userspace/libsinsp/sinsp.h
Outdated
@@ -1125,6 +1142,7 @@ class SINSP_PUBLIC sinsp : public capture_stats_source | |||
bool m_isfatfile_enabled; | |||
bool m_isinternal_events_enabled; | |||
bool m_hostname_and_port_resolution_enabled; | |||
bool m_user_details_enabled; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd move this bool inside m_usergroup_manager
.
userspace/libsinsp/sinsp.h
Outdated
\param enable If set to false, no extended user information will be | ||
stored in sinsp_threadinfo, only user id/group id will be available. | ||
*/ | ||
void set_user_details(bool enable); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to have a single method set_import_users(bool import_users, bool with_details=true)
, instead of having 2 different APIs.
userspace/libsinsp/threadinfo.h
Outdated
{ | ||
uint32_t uid; ///< User ID | ||
uint32_t gid; ///< Group ID | ||
std::string name; ///< Username |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Am not sure whether using string is what we want here; but perhaps it works fine (ie: have you already tested memory consumption with this change?)
Anyway, storing <NA>
when details are not enabled should not cost that much, right? Returning an empty string breaks UX since we always used NA
for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I've re-run memory test with this change, getting the same results. From what I understand, an emptystring
doesn't introduce as much overhead as statically allocated array anyway (although it might be larger than some alternatives).
Let me do couple more experiments to estimate how much exactly an overhead from using string and initializing it with NA would be. Another option to this could be to turn this structure into a class with a method getName
, that will return NA if the underlying string is empty. This would seem to satisfy both memory and UX sides, but will involve more changes. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I've re-run memory test with this change, getting the same results. From what I understand, and emptystring doesn't introduce as much overhead as statically allocated array anyway (although it might be larger than some alternatives).
+1 thanks!
Another option to this could be to turn this structure into a class with a method getName, that will return NA if the underlying string is empty.
Let's first try if storing <NA>
in the string causes overhead, otherwise yep this would be an ok solution for me!
Thank you very much for the deep dive btw!
2df0a34
to
02fb52f
Compare
02fb52f
to
08b0500
Compare
@FedeDP here is what I came up with to address the latest feedback. Essentially everything as discussed, except few unexpected changes in |
Sorry for the noise! |
/milestone 0.16.0 |
I think you need to fix chisels build :) Shoudl be pretty straightforward! |
The failure comes from the |
875b652
to
4513e73
Compare
Yeah, I'm going to try to simulate exactly the same build procedure as in this test, maybe that would help. |
Perhaps @therealbobo can help us here? He was the first working on the emiscripten build integration if i recall correctly! 🙏 |
@FedeDP, @therealbobo Yeah, that would be great. I can reproduce the issue, but it's extremely weird and I can't come up with any explanation how it could be connected to the PR itself. What I observe is that It seems that to trigger the issue it's enough to add only the subclasses |
Am not able to test on my machine because my nodejs version is throwing some errors :/
|
Sorry for the late response! I gave a quick look and after some search I tried with success the following fix:
The problem is that the linker option `-sSTACK_SIZE' is not present in the emscripten version of the CI AFAIK but I could be wrong. Probably 5MB is overkill and we can settle with less. |
Thanks bobo! What i don't get is why that CI is only failing on this PR though. |
@erthalion i also noticed that some |
My pr was merged! Care to rebase? Thanks! |
sinsp_threadinfo contains two fields with user and login_user information. Since those fields are of scap_userinfo type and statically allocated, they take a lot of space: scap_userinfo m_user; /* 368 2312 */ scap_userinfo m_loginuser; /* 2680 2312 */ which is 4624 bytes out of 5728 for the whole sinsp_threadinfo: /* size: 5728, cachelines: 90, members: 64 */ Most of this memory is coming from the fields name (MAX_CREDENTIALS_STR_LEN), homedir and shell (both SCAP_MAX_PATH_SIZE). For a process-heavy workload this can mean a lot of memory taken for these purposes. To make memory management more flexible, split m_user/m_loginuser into two set of fields: * one containing uid/gid, which are ubiquitous and generally used everywhere * one for the rest of heavy details, which are needed less often The new sinsp_userinfo class is not supposed to use separately from sinsp_threadinfo, thus it's defined inside the class. Co-authored-by: Mauro Ezequiel Moltrasio <mmoltras@redhat.com> Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com>
4513e73
to
8f2e7d6
Compare
Thanks @therealbobo @FedeDP for investigating, this failure was really driving me crazy :) |
Yay!!! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/approve
LGTM label has been added. Git tree hash: 33363019642517d7835bc29861129eef6f092a44
|
@leogr @jasondellaluce I would appreciate some final review and an approval if everything is fine :) |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: erthalion, FedeDP, leogr The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
After this PR falcosecurity/libs#1765 we are now able to correctly retrieve again user name info. For this reason we need to update the triggered rules Signed-off-by: Andrea Terzolo <andreaterzolo3@gmail.com>
After this PR falcosecurity/libs#1765 we are now able to correctly retrieve again user name info. For this reason we need to update the triggered rules Signed-off-by: Andrea Terzolo <andreaterzolo3@gmail.com>
After this PR falcosecurity/libs#1765 we are now able to correctly retrieve again user name info. For this reason we need to update the triggered rules Signed-off-by: Andrea Terzolo <andreaterzolo3@gmail.com>
What type of PR is this?
/kind feature
Any specific area of the project related to this PR?
/area libsinsp
Does this PR require a change in the driver versions?
sinsp_threadinfo
contains two fields withuser
andlogin_user
information. Since those fields are of
scap_userinfo
type and staticallyallocated, they take a lot of space:
which is 4624 bytes out of 5728 for the whole sinsp_threadinfo:
Most of this memory is coming from the fields
name
(
MAX_CREDENTIALS_STR_LEN
),homedir
andshell
(bothSCAP_MAX_PATH_SIZE
).For a process-heavy workload this can mean a lot of memory taken for
these purposes, as in the example below (max possible threadinfo cache size):
The memory usage breakdown shows that most of it goes to manage
sinsp_threadinfo
(where_M_allocate_node
is a part of threadinfo allocationprocess, e.g. in
get_thread_ref
):To make memory management more flexible, split
m_user
/m_loginuser
intotwo set of fields:
everywhere
The new user_details structure is not supposed to use separately from
sinsp_threadinfo, thus it's defined inside the class. Allow to configure
whether to use the full or slim version of threadinfo via compile time
configuration.
Which issue(s) this PR fixes:
Special notes for your reviewer:
Does this PR introduce a user-facing change?: